Abstract

We compare forecasts of United States inflation from the Survey of Professional 
Forecasters (SPF) to predictions made by simple statistical techniques. In nowcasting, 
economic expertise is persuasive. When projecting beyond the current quarter, novel 
yet simplistic probabilistic no-change forecasts are equally competitive. We further 
interpret surveys as ensembles of forecasts, and show that they can be used similarly to 
the ways in which ensemble prediction systems have transformed weather forecasting. 
Then we borrow another idea from weather forecasting, in that we apply statistical 
techniques to postprocess the SPF forecast, based on experience from the recent past. 
The foregoing conclusions remain unchanged after survey postprocessing. 

Key words and phrases: inflation; predictive distribution; reference forecast; statisti- 
cal postprocessing; Survey of Professional Forecasters. 

1 Introduction 

A wealth of societal decisions can benefit from accurate forecasts of future inflation, rang- 
ing from the setting of monetary and fiscal policies to negotiations of wage contracts and 
investment judgments. To predict inflation rates, various methods have been employed, 
including statistical time series techniques, methods based on term structures and yield
curves, and survey-based measures from consumers or professional experts. A prominent
recent study argues that survey forecasts perform best (Ang et al., 2007).

The Survey of Professional Forecasters (SPF) is the leading quarterly survey of macroe- 
conomic variables in the United States (Zarnowitz, 1969; Croushore, 1993). It began in 
1968 and was conducted by the American Statistical Association and the National Bu- 
reau of Economic Research, before the Federal Reserve Bank of Philadelphia took over 
in 1990. The panel comprises university professors as well as private sector economists, 
who are asked each quarter to predict a range of macroeconomic variables for the current 
and each of the following four quarters. Our interest here is in inflation, in the form of the 
quarter-over-quarter change of the consumer price index expressed in annualized percentage
points, which we now refer to as an inflation rate. Figure 1 displays observed United
States inflation rates since the third quarter of 1995 along with the SPF forecasts issued a
year earlier. Following common practice, we talk of a prediction horizon of one quarter
when referring to current quarter nowcasts, and of prediction horizons of two to five quarters
for the following four quarters (Croushore and Stark, 2001). For example, forecasts
issued a year earlier correspond to a prediction horizon of five quarters.

The SPF panel forecast can be summarized to provide a single point forecast, where
we also follow common practice and take it to be the median of the individual experts'
predictions (Stark, 2010). The traditional no-change forecast equals the most recent available
observed rate. This is the classical reference forecast in the economic literature and
that used by the Federal Reserve Bank of Philadelphia. Here, we introduce a novel kind 
of simplistic reference forecast, which we call the probabilistic no-change forecast. It con- 
siders a rolling training period, consisting of the 20 most recent observed inflation rates, 
and takes the median thereof as a point predictor. This can be interpreted as a theoretically 
optimal point forecast under a white noise assumption for the inflation rates, whereas the 
traditional no-change forecast can be interpreted as optimal under a random walk model. 
Figure 1: SPF forecasts of United States inflation at a prediction horizon of five quarters. The small
blue dots show the individual SPF experts' point forecasts, with the pale blue boxes indicating their 
range. The large purple dots represent the observed inflation rates. Note the extreme deflation in the 
fourth quarter of 2008. 



In Section 2 we compare the predictive performance of the SPF point forecast to the simple
no-change forecasts. Essentially, in current quarter nowcasts economic expertise is persua- 
sive. At prediction horizons beyond the current quarter, the SPF forecast outperforms the 
traditional no-change forecast, but not the novel, equally simplistic probabilistic no-change 
forecast. 

The SPF forecast can also be interpreted as an ensemble forecast, similar to the ways
in which weather and climate scientists have been using ensemble prediction systems with
great success (Palmer, 2002). State of the art weather forecasting uses ensembles whose
members are point forecasts from numerical weather prediction models, with the members
differing in initial conditions and/or the specifics of the numerical model used. For
ensemble weather forecasts, some form of statistical postprocessing is required to correct
for model biases and insufficient representations of the forecast uncertainty (Gneiting and
Raftery, 2005). From Figure 1 we see that, similar to weather ensembles, the SPF ensemble
forecast is uncalibrated, in that too many observations fall outside the range of the ensemble
forecast.



Postprocessing methods in meteorology provide statistically corrected predictive probability
distributions for future weather quantities that condition on the ensemble forecast.
From the predictive distribution, the probability of any event of interest can be computed,
and one can issue the optimal point forecast under the loss function at hand (Diebold et al.,
1998; Engelberg et al., 2009). We adopt this approach and develop statistical postprocessing
methods for the SPF by using heteroscedastic regression and Gaussian mixture models.
In Section 3 these methods are introduced in detail and their predictive performance is
evaluated, with results that resemble those without postprocessing. Finally, in Section 4 we
study the robustness of our results. Concluding remarks are given in Section 5, where we
discuss the methodological as well as the economic and societal implications of our work.



2 Predictive performance: SPF versus no-change forecasts 

The Survey of Professional Forecasters (SPF) panel comprises academic as well as private 
sector economists, who are asked to provide point forecasts for a range of macroeconomic 
variables. They are also requested to provide probability forecasts, but these refer to an- 
nual, rather than quarterly, percentage change, and thus are not considered here. The fore- 
casts are issued in the middle of a quarter for the current quarter (a prediction horizon of 
one quarter) and each of the following four quarters (prediction horizons of two quarters 
through five quarters). While Engelberg et al. (2009) show that the predictions are mostly
consistent with the hypothesis that SPF panel members report their subjective means, medians
or modes, they also note that SPF forecasters tend to give more favorable views of
the economy than warranted by their subjective probabilities. 

Inflation rates based on the consumer price index (CPI) have been included in the
survey since the third quarter of 1981, and forecast data are available on-line at
http://www.phil.frb.org/econ/spf/. As realizing observations, we use the CPI vintage
from the Real-Time Data Research Center of the Federal Reserve Bank of Philadelphia
(Croushore and Stark, 2001), which is available from the third quarter of 1994 on. The
traditional and probabilistic no-change forecasts then use the most recent vintage available at
the issuing time of the forecast, while the predictive performance is evaluated against the
May 2010 vintage. To convert the original monthly observations of the CPI into annualized
quarterly growth rates, we follow common practice by averaging the monthly observations
of each quarter, and using the formula

\[ y_t = \left( \left( \frac{z_t}{z_{t-1}} \right)^{4} - 1 \right) \times 100, \]

where $z_t$ is the observed quarterly CPI in quarter $t$, and $y_t$ is the observed quarter-over-quarter
growth rate of the CPI, or simply, the inflation rate in quarter $t$, in percentage
points.
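The conversion from monthly CPI levels to annualized quarterly inflation rates can be sketched as follows. This is a minimal illustration, assuming the monthly levels arrive as a plain list with exactly three observations per quarter, oldest first; the function name `quarterly_inflation` is hypothetical and not taken from the paper.

```python
from statistics import mean


def quarterly_inflation(monthly_cpi):
    """Annualized quarter-over-quarter inflation rates, in percentage points.

    monthly_cpi: monthly CPI levels, three per quarter, oldest first
    (an assumed input layout for this sketch).
    """
    if len(monthly_cpi) < 6 or len(monthly_cpi) % 3 != 0:
        raise ValueError("need at least two full quarters of monthly data")
    # z_t: average of the three monthly CPI observations in quarter t
    z = [mean(monthly_cpi[i:i + 3]) for i in range(0, len(monthly_cpi), 3)]
    # y_t = ((z_t / z_{t-1})^4 - 1) * 100
    return [((z[t] / z[t - 1]) ** 4 - 1.0) * 100.0 for t in range(1, len(z))]
```

A quarterly CPI increase of 1% thus corresponds to an annualized rate slightly above 4%, because the quarterly ratio is compounded over four quarters.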

We now compare the predictive performance of the SPF median forecast to the traditional
and probabilistic no-change forecasts. Due to the dynamic data vintage, the traditional
no-change forecast does not exactly trail the observations, even though it does so
approximately. The probabilistic no-change forecast uses the median of the 20 most recent
observations available at the issuing time. Table 1 summarizes the predictive performance
for the period from the third quarter of 1995 through the first quarter of 2010 in terms of
the mean absolute error (MAE). In addition, we report the results of Diebold and Mariano
(1995) tests of the hypothesis of equal predictive performance between the SPF forecast
and the reference forecasts. In doing so, we provide the lower tail probability under the
null hypothesis in percentage points, where a value of 00 indicates a lower tail probability 
less than or equal to 1%, a value of 01 a lower tail probability between 1% and 2%, . . . , 
and a value of 99 a lower tail probability exceeding 99%. Thus, values from 00 to 04 and 
95 to 99 correspond to a statistically significant difference at the 5% level for a one-sided 
test, and at the 10% level for a two-sided test. 
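The two reference point forecasts and the MAE can be sketched as follows. This is an illustrative sketch only: it assumes that `history` holds the inflation rates available at the issuing time (oldest first, from the then-current vintage), and all function names are hypothetical.

```python
from statistics import median


def traditional_no_change(history):
    """Point forecast: the most recent available observed inflation rate."""
    if not history:
        raise ValueError("need at least one observation")
    return history[-1]


def probabilistic_no_change(history, window=20):
    """Point forecast: median of the `window` most recent available rates."""
    if len(history) < window:
        raise ValueError("not enough observations for the training window")
    return median(history[-window:])


def mean_absolute_error(forecasts, observations):
    """MAE over paired point forecasts and verifying observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must be paired")
    return sum(abs(f - y) for f, y in zip(forecasts, observations)) / len(forecasts)
```

The same forecast value is issued for every prediction horizon at a given issuing time; only the verifying observation changes with the horizon.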

Table 1: Mean absolute error (MAE) for point forecasts of United States inflation from the third
quarter of 1995 to the first quarter of 2010, in percentage points, along with the lower tail probability
(in parentheses) of the Diebold-Mariano test for equal predictive performance between the SPF
forecast and the reference forecast.

Forecast method            Prediction horizon in quarters
                           1           2           3           4           5
SPF                        0.89        1.44        1.51        1.49        1.49
Probabilistic no-change    1.45 (99)   1.46 (58)   1.45 (18)   1.48 (39)   1.48 (44)
Traditional no-change      1.81 (99)   2.06 (96)   2.00 (91)   2.06 (95)   2.03 (97)

The table confirms the well known fact that the SPF median forecast outperforms the
traditional no-change forecast at all prediction horizons. The corresponding MAE differentials
are statistically significant as well as substantial in size, reaching values close to a
percentage point. At a nowcasting prediction horizon of a single quarter, the SPF median
forecast also has substantially lower MAE than the probabilistic no-change forecast, even
though the latter is more competitive than the traditional no-change forecast. However, at
prediction horizons from two to five quarters, the SPF forecast is unable to outperform the
probabilistic no-change forecast, showing MAE values that are about equal for the two
methods.

Thus far, we have considered point forecasts, as opposed to probabilistic forecasts or
predictive distributions, which are of ever increasing importance in a wide trans-disciplinary
range of applications (Timmermann, 2000; Gneiting, 2008). The SPF and the probabilistic
no-change method provide predictive distributions in natural ways, in that they can be
identified with the discrete probability measures that assign equal mass to each of the experts,
or each of the CPI observations in the training period, respectively. For example,
the predictive distribution that corresponds to our standard version of the probabilistic no-change
forecast assigns mass 1/20 to each of the 20 most recent inflation observations
available at the issuing time. To obtain a predictive distribution associated with the traditional
no-change forecast, we take it to be Gaussian, with mean equal to the most recent
available observation, and variance equal to the empirical mean squared error (MSE) of the
traditional no-change forecast over the rolling 20-quarter training period.

Table 2: Mean continuous ranked probability score (CRPS) for probabilistic forecasts of United
States inflation from the third quarter of 1995 to the first quarter of 2010, in percentage points,
along with the lower tail probability (in parentheses) of the Diebold-Mariano test for equal predictive
performance between the SPF forecast and the reference forecast.

Forecast method            Prediction horizon in quarters
                           1           2           3           4           5
SPF                        0.69        1.16        1.25        1.26        1.27
Probabilistic no-change    1.08 (99)   1.10 (21)   1.10 (00)   1.10 (00)   1.11 (00)
Traditional no-change      1.56 (99)   1.66 (97)   1.51 (82)   1.57 (?0)   1.47 (87)
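The Gaussian predictive distribution attached to the traditional no-change forecast can be sketched as follows. The sketch assumes that past no-change forecasts and their verifying observations are supplied as paired lists (oldest first); how these pairs are assembled from the data vintages at each prediction horizon is not spelled out here, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def traditional_no_change_distribution(latest_observation, past_forecasts,
                                       past_observations, window=20):
    """Gaussian predictive distribution for the traditional no-change forecast.

    Mean: the most recent available observation.
    Variance: empirical MSE of past no-change forecasts over the last
    `window` forecast-observation pairs (assumed pairing for this sketch).
    """
    pairs = list(zip(past_forecasts, past_observations))[-window:]
    if not pairs:
        raise ValueError("need at least one past forecast-observation pair")
    mse = mean((f - y) ** 2 for f, y in pairs)
    if mse <= 0.0:
        raise ValueError("MSE must be positive to define a Gaussian spread")
    return NormalDist(mu=latest_observation, sigma=sqrt(mse))
```

From the returned object, event probabilities follow directly; for instance, `dist.cdf(0.0)` is the predictive probability of deflation.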

To assess the predictive performance of the probabilistic forecasts, we use the continuous
ranked probability score (CRPS), which is a decision theoretically coherent proper
scoring rule, and reduces to the absolute error in the case of a point forecast (Matheson
and Winkler, 1976; Gneiting and Raftery, 2007). If the predictive cumulative distribution
function (CDF) is $F$ and $y$ is the verifying observation, the CRPS is defined as

\[ \mathrm{crps}(F, y) = \int_{-\infty}^{\infty} \left( F(x) - \mathbf{1}\{x \geq y\} \right)^2 \, dx, \tag{1} \]

where $\mathbf{1}\{x \geq y\}$ denotes an indicator function that attains the value 1 if $x \geq y$ and the
value 0 otherwise. Grimit et al. (2006) showed that for a discrete probability measure $F$
that puts mass $1/M$ on each of $x_1, \ldots, x_M$, this can be written as

\[ \mathrm{crps}(F, y) = \frac{1}{M} \sum_{m=1}^{M} |x_m - y| - \frac{1}{2M^2} \sum_{m=1}^{M} \sum_{n=1}^{M} |x_m - x_n|. \]
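The discrete form of the CRPS translates directly into code. The following is a minimal sketch for an equally weighted ensemble, such as the SPF panel or the 20 observations of the probabilistic no-change forecast; the function name `crps_ensemble` is illustrative.

```python
def crps_ensemble(members, y):
    """CRPS of the discrete distribution with mass 1/M on each ensemble member.

    members: the M point forecasts x_1, ..., x_M
    y:       the verifying observation
    """
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    # First term: mean absolute error of the members
    accuracy = sum(abs(x - y) for x in members) / m
    # Second term: half the mean absolute difference between member pairs
    spread = sum(abs(a - b) for a in members for b in members) / (2.0 * m * m)
    return accuracy - spread
```

With a single member the spread term vanishes and the score is the absolute error, in line with the reduction to the absolute error for point forecasts.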

Table 2 summarizes the predictive performance of the SPF, probabilistic no-change and
traditional no-change forecasts in terms of the mean CRPS over the foregoing test period.
Again, the SPF forecast outperforms the extant reference standard, namely the traditional
no-change forecast, at all prediction horizons. In nowcasts for the current quarter, the SPF
forecast also shows substantially lower CRPS than the probabilistic no-change forecast.
Beyond the current quarter, in true forecasting mode, the simplistic probabilistic no-change
forecast outperforms the SPF experts. Furthermore, as Figure 2 demonstrates, these results
are robust to the choice of the length of the rolling training period for the probabilistic
no-change forecast.

Figure 2: Mean absolute error (MAE, red solid line) and mean continuous ranked probability score
(CRPS, red dashed line) for five-quarter ahead probabilistic no-change forecasts for United States
inflation from the third quarter of 1995 to the first quarter of 2010, as functions of the length of
the rolling training period. For comparison, we also show the MAE (blue solid line) and the mean
CRPS (blue dashed line) for the SPF forecast.

A valid concern at this point is that the SPF panel is simply a collection of point forecasts
that is not necessarily meant to be taken as a discrete predictive distribution. As noted,
this resembles the situation in weather forecasting where meteorologists use ensembles of
point forecasts from numerical weather prediction models (Palmer, 2002), which are subject
to biases and dispersion errors, thus calling for statistical postprocessing (Gneiting
and Raftery, 2005). Indeed, the Federal Reserve Bank of Philadelphia has recently experimented
with Gaussian density forecasts that derive from the SPF panel (Stark, 2010) and
can be interpreted as postprocessing methods. In the next section, we take up this idea and
develop statistical postprocessing techniques that are tailored to the SPF panel.
3 Survey postprocessing



In weather forecasting, statistical postprocessing methods have been used with great success
to improve the predictive performance of ensemble prediction systems, with heteroscedastic
regression (Gneiting et al., 2005; Thorarinsdottir and Gneiting, 2010) and
Bayesian model averaging (Raftery et al., 2005; Sloughter et al., 2010) being state of the
art techniques. Here we introduce variants of these methods that are tailored to the SPF
ensemble and inflation forecasts. In the SPF, the composition of the expert panel changes
gradually over time, with individual members providing forecasts for about six years on
average (Engelberg et al., 2009). Furthermore, there are missing forecasts for essentially
all members even during the period when they are active. Thus, it is very difficult to assess
the predictive performance of individual members, in contrast to what is commonly
done in weather forecasting. Therefore, the predictive distributions obtained here depend
on summary statistics derived from the SPF panel or the probabilistic no-change forecast.
In this context, we denote the median and the variance of the SPF panel by $\mu_{\mathrm{SPF}}$ and $\sigma^2_{\mathrm{SPF}}$.
Similarly, we write $\mu_{\mathrm{PNC}}$ and $\sigma^2_{\mathrm{PNC}}$ for the median and the variance of the probabilistic
no-change (PNC) forecast with a rolling 20 quarter training period. The density function and
the CDF of the standard normal distribution will be denoted by $\varphi$ and $\Phi$, respectively.



3.1 Heteroscedastic regression 



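Gneiting et al. (2005) and Thorarinsdottir and Gneiting (2010) proposed a statistical postprocessing
method for ensembles of point forecasts that uses heteroscedastic regression
(HR) (Leslie et al., 2007), where the location parameter of the predictive distribution is a
linear function of the ensemble member forecasts, and the scale parameter is a linear function
of the ensemble variance. Here, we adapt the method so that the postprocessed predictive
distribution is the asymmetric three-parameter two-piece normal distribution (John,
1982) with density

\[ f_{\mathrm{TPN}}(y) = \begin{cases} \left( \dfrac{2}{\pi} \right)^{1/2} (\sigma_1 + \sigma_2)^{-1} \exp\left( -\dfrac{(y - \mu)^2}{2 \sigma_1^2} \right) & \text{if } y \leq \mu, \\[2ex] \left( \dfrac{2}{\pi} \right)^{1/2} (\sigma_1 + \sigma_2)^{-1} \exp\left( -\dfrac{(y - \mu)^2}{2 \sigma_2^2} \right) & \text{if } y \geq \mu, \end{cases} \]

and CDF

\[ F_{\mathrm{TPN}}(y) = \begin{cases} \dfrac{2 \sigma_1}{\sigma_1 + \sigma_2} \, \Phi\left( \dfrac{y - \mu}{\sigma_1} \right) & \text{if } y \leq \mu, \\[2ex] \dfrac{\sigma_1 - \sigma_2}{\sigma_1 + \sigma_2} + \dfrac{2 \sigma_2}{\sigma_1 + \sigma_2} \, \Phi\left( \dfrac{y - \mu}{\sigma_2} \right) & \text{if } y \geq \mu. \end{cases} \tag{2} \]
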
The two-piece normal distribution has been used for density forecasts of United Kingdom
inflation by the Bank of England since 1996 (Wallis, 1999; Elder et al., Autumn 2005).
Its asymmetry allows for a distinct treatment of upside and downside risks in inflation
forecasting. If $\sigma_1 < \sigma_2$ the distribution is right skewed, and both the mean and the median
exceed its mode, $\mu$. If $\sigma_1 > \sigma_2$ the distribution is left skewed.

We consider two variations of the HR approach. The first variant models the parameters
of the two-piece normal distribution as functions of the SPF median and SPF variance, in
that

\[ \mu = a + b \, \mu_{\mathrm{SPF}}, \qquad \sigma_1^2 = c_1 + d_1 \, \sigma^2_{\mathrm{SPF}}, \qquad \sigma_2^2 = c_2 + d_2 \, \sigma^2_{\mathrm{SPF}}, \tag{3} \]

and we refer to it as the HR model with SPF covariates. The second nests the first and
posits

\[ \mu = a + b_1 \mu_{\mathrm{SPF}} + b_2 \mu_{\mathrm{PNC}}, \qquad \sigma_1^2 = c_1 + d_{11} \sigma^2_{\mathrm{SPF}} + d_{12} \sigma^2_{\mathrm{PNC}}, \qquad \sigma_2^2 = c_2 + d_{21} \sigma^2_{\mathrm{SPF}} + d_{22} \sigma^2_{\mathrm{PNC}}. \tag{4} \]

We refer to the specification in (4) as the HR model with SPF and PNC covariates. In
out-of-sample forecasting overfitting is heavily penalized, and it is an empirical question
whether or not this more complex model can outperform the more parsimonious one.
Table 3: Minimum CRPS estimates for the HR density forecast (4) of United States inflation in the
first quarter of 2008 at a prediction horizon of two quarters, using a rolling training period of 40
quarters. The resulting parameter values for the two-piece normal distribution (2) are a mode
of $\mu = 1.90$, with standard deviations $\sigma_1 = 0.59$ and $\sigma_2 = 3.27$, respectively.

a       b_1     b_2     c_1     d_11    d_12    c_2     d_21    d_22
0.36    0.53    0.00    0.00    0.52    0.00    0.00    0.00    3.10



In Gneiting et al. (2005) and Thorarinsdottir and Gneiting (2010), the parameters of
the postprocessing model were estimated by minimizing the mean CRPS over the training
period, and this was shown to yield slightly better predictive performance than maximum
likelihood estimation. In an economic context, similar approaches have been discussed
by Elliott and Timmermann (2008). We adopt this proposal, using formula (8) in the Appendix,
and performing the minimization numerically, via the Broyden-Fletcher-Goldfarb-Shanno
algorithm as implemented in R (R Development Core Team, 2008). Each prediction
horizon requires its individual fit with distinct parameter estimates, using a rolling
training period from the recent past.

As an example, Figure 3 and Table 3 illustrate the HR density forecast with SPF and
PNC covariates for the first quarter of 2008 at a prediction horizon of two quarters, fitted on
a rolling training period of 40 quarters. The point forecasts of the SPF experts had median
2.90 and standard deviation 0.82. The PNC ensemble had a higher median, at 3.30, and a
higher standard deviation, at 1.86. Table 3 shows the parameter estimates for the HR model
(4). The mode $\mu$ of the two-piece normal distribution (2) is determined by the SPF median,
the downside risk $\sigma_1$ by the SPF spread, and the upside risk $\sigma_2$ by the PNC spread. Figure
3 shows the SPF and PNC ensembles along with the postprocessed density forecast, which
is strongly right skewed, with mode at 1.90 and median at 3.67, higher than both the SPF
median and the PNC median. The verifying inflation rate in the first quarter of 2008 was
4.66 percentage points.
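As a worked check of this example, the following sketch plugs the rounded Table 3 estimates and the quoted ensemble summaries into the HR specification with SPF and PNC covariates. The variable names are illustrative. Because the inputs are rounded to two decimals, the results agree with the reported values only approximately.

```python
from math import sqrt
from statistics import NormalDist

# Rounded parameter estimates from Table 3
a, b1, b2 = 0.36, 0.53, 0.00
c1, d11, d12 = 0.00, 0.52, 0.00
c2, d21, d22 = 0.00, 0.00, 3.10

# Ensemble summaries quoted in the text (b2 = 0, so the PNC median has no effect)
mu_spf, sd_spf = 2.90, 0.82
mu_pnc, sd_pnc = 3.30, 1.86

# Link functions of the HR model: mode and the two squared scales
mu = a + b1 * mu_spf + b2 * mu_pnc
sigma1 = sqrt(c1 + d11 * sd_spf ** 2 + d12 * sd_pnc ** 2)
sigma2 = sqrt(c2 + d21 * sd_spf ** 2 + d22 * sd_pnc ** 2)

# Median of the two-piece normal. Since sigma1 < sigma2, less than half of
# the probability mass lies below the mode, so the upper piece of the CDF applies.
total = sigma1 + sigma2
z = NormalDist().inv_cdf((0.5 * total - (sigma1 - sigma2)) / (2.0 * sigma2))
median = mu + sigma2 * z

print(f"mode={mu:.2f} sigma1={sigma1:.2f} sigma2={sigma2:.2f} median={median:.2f}")
```

The computed mode and scales round to the 1.90, 0.59 and 3.27 given in the caption of Table 3, and the median lands within about 0.01 of the reported 3.67, a gap consistent with the rounding of the inputs.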
Figure 3: Density forecast for the United States inflation rate in the first quarter of 2008 at a prediction
horizon of two quarters, using the HR model (4) with both SPF and PNC covariates. At bottom,
the SPF panel (blue bars), which has median 2.90, and the PNC ensemble (red bars), with median
3.25, are shown. The median of the density forecast is indicated by the solid orange line at 3.67,
and the realizing inflation rate is represented by the purple dashed line at 4.66.

3.2 Gaussian mixture models 



Bayesian model averaging (BMA) is a standard method for combining inferences (Hoeting
et al., 1999). Its use in the statistical postprocessing of ensemble weather forecasts was proposed
by Raftery et al. (2005) and Sloughter et al. (2010). In the normal mixture version of
Raftery et al. (2005), the BMA predictive density is a mixture of Gaussian densities, where
the components are associated with individual ensemble members, and the mixture weights
reflect the members' relative contribution to predictive skill over the training period.

Here we use a similar idea, taking the postprocessed predictive distribution to be a
mixture of two Gaussian components, with a CDF of the form

\[ F_{\mathrm{GM}}(y) = \alpha \, \Phi\left( \frac{y - \mu_1}{\sigma_1} \right) + (1 - \alpha) \, \Phi\left( \frac{y - \mu_2}{\sigma_2} \right). \tag{5} \]

In a first variant we put

\[ \mu_1 = \mu_{\mathrm{SPF}}, \quad \sigma_1 = \sigma_{\mathrm{SPF}}, \quad \mu_2 = \mu_{\mathrm{PNC}}, \quad \sigma_2 = \sigma_{\mathrm{PNC}}, \tag{6} \]

so that the mixture weight $\alpha \in [0, 1]$ is the only parameter to be estimated. This is our most
parsimonious, standard Gaussian mixture (GM) model. In a second variant, we put

\[ \mu_1 = \mu_{\mathrm{SPF}}, \quad \mu_2 = \mu_{\mathrm{PNC}}, \tag{7} \]

with $\alpha \in [0, 1]$, $\sigma_1 > 0$ and $\sigma_2 > 0$ to be estimated. We refer to (7) as the GM model
with variance adjustment. In our experience, bias correction of the location parameters
deteriorates the predictive performance out of sample, and thus we do not present results
for such models. The parameters are estimated by maximum likelihood via the expectation-maximization
(EM) algorithm, based on training data from a rolling training period, as
described by Raftery et al. (2005) and implemented in the R package ensembleBMA.

An example of a GM predictive distribution with variance adjustment is given in Figure 4,
using a rolling training period of 40 quarters.
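The two-component mixture CDF and its quantiles can be sketched as follows. The mixture has no closed-form quantile function, so the sketch uses plain bisection; this is a choice made here for illustration and not necessarily how the paper computes medians. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gm_cdf(y, alpha, mu1, sigma1, mu2, sigma2):
    """CDF of a mixture of two Gaussians with weight alpha on the first."""
    return (alpha * _STD_NORMAL.cdf((y - mu1) / sigma1)
            + (1.0 - alpha) * _STD_NORMAL.cdf((y - mu2) / sigma2))


def gm_quantile(p, alpha, mu1, sigma1, mu2, sigma2, tol=1e-10):
    """Quantile of the mixture, found by bisection on the monotone CDF."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Bracket wide enough for any non-extreme probability level
    width = 10.0 * max(sigma1, sigma2)
    lo, hi = min(mu1, mu2) - width, max(mu1, mu2) + width
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gm_cdf(mid, alpha, mu1, sigma1, mu2, sigma2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With the values reported in the caption of Figure 4 (weight 0.59, SPF median 2.20 with adjusted scale 0.98, PNC median 3.05 with adjusted scale 1.30), `gm_quantile(0.5, 0.59, 2.20, 0.98, 3.05, 1.30)` returns a median close to the 2.49 stated there.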



3.3 Predictive performance 

In the SPF record, quarterly forecasts and observations of the CPI are available for 115 
quarters. We split the data such that the forecasts through the second quarter of 1995 are 
used solely for training purposes, while the methods are tested on the data thereafter. The 
CPI vintage data have been released quarterly since the third quarter of 1994 and it would 
thus not have been possible to issue real-time forecasts based on this data prior to the third 
quarter of 1994. For parameter estimation, we use a rolling training period of 40 quarters. 
The PNC ensembles used are based on 20 quarterly observations. 

Figure 5 illustrates the postprocessed predictive distributions using the HR method with
SPF and PNC covariates, and the GM technique with variance adjustment, for the period
from 2006 on. Through 2009, the methods yield very similar predictive distributions, while
their spreads diverge for the period from early 2010 on. This goes hand-in-hand with the
growing discrepancy between the medians of the SPF and the PNC ensembles.

Figure 4: Density forecast for the United States inflation rate in the second quarter of 2005 at a
prediction horizon of two quarters (in pale green), using the GM model (7) with variance adjustment.
At bottom, the SPF panel (blue lines), which has median 2.20 and standard deviation 0.85, and the
PNC ensemble (red lines), with median 3.05 and standard deviation 1.39, are shown. The median
of the density forecast is indicated by the green solid line at 2.49, and the realizing inflation rate is
represented by the purple dashed line at 2.73. The corresponding parameter estimates are $\alpha = 0.59$,
$\sigma_1 = 0.98$ and $\sigma_2 = 1.30$. The densities corresponding to the SPF panel and the PNC forecast are
indicated accordingly.

In the performance comparison below, we include a probabilistic forecast with conditional
predictive CDF given by

\[ F(y) = \Phi\left( \frac{y - \mu_{\mathrm{SPF}}}{\sqrt{\mathrm{MSE}_{\mathrm{SPF}}}} \right), \]

where $\mathrm{MSE}_{\mathrm{SPF}}$ denotes the mean squared error of the SPF median over the past 40 quarters.
The Federal Reserve Bank of Philadelphia has been experimenting with this method (Stark,
2010, Section 3.2), and we refer to it as the SPF median with MSE technique.

Figure 5: Recent forecasts of the United States inflation rate at a prediction horizon of five quarters.
The point forecasts shown are the medians from the SPF panel data (blue line), the PNC ensemble
(red line), the HR method with SPF and PNC covariates (orange line), and the GM technique with
variance adjustment (green line). The 80% prediction intervals for the HR method are indicated
in orange, the 80% prediction intervals for the GM technique are indicated in pale green, and the
large purple dots represent the observed inflation rates. Furthermore, the point forecasts made by
the individual SPF forecasters with ID 463 (magenta line) and ID 483 (cyan line) are shown. Note
that several of the forecasts are identical from early 2010 on.
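For a Gaussian predictive distribution, the CRPS has a well known closed form, which makes a forecast of this type easy to score. The sketch below assumes that the SPF median with MSE technique is a Gaussian centered at the SPF median with variance equal to the MSE of the SPF median; the function names are hypothetical.

```python
from math import pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of the N(mu, sigma^2) predictive distribution at y."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * _STD_NORMAL.cdf(z) - 1.0)
                    + 2.0 * _STD_NORMAL.pdf(z) - 1.0 / sqrt(pi))


def crps_spf_median_with_mse(spf_median, mse_spf, y):
    """CRPS under the assumed Gaussian form: mean = SPF median, variance = MSE."""
    return crps_gaussian(spf_median, sqrt(mse_spf), y)
```

Averaging `crps_spf_median_with_mse` over the test period gives a mean CRPS in the same units (percentage points) as the scores of the ensemble forecasts.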

Tables 4 and 5 compare the predictive performance of the postprocessing methods to
that of the original SPF and probabilistic no-change forecasts, in terms of the MAE and
the mean CRPS. In nowcasting at a prediction horizon of one quarter, none of the postprocessing
techniques outperforms the SPF forecast, in terms of either the MAE or the
mean CRPS. At prediction horizons of two to five quarters, original and postprocessed SPF
forecasts as well as the probabilistic no-change forecast all yield about equal MAE. Postprocessing
generally improves the mean CRPS, reaching a predictive performance that is
comparable, but not superior, to that of the probabilistic no-change forecast.
Table 4: Mean absolute error (MAE) for forecasts of United States inflation from the third quarter
of 1995 to the first quarter of 2010, in percentage points, along with the lower tail probability
(in parentheses) of the Diebold-Mariano test under the hypothesis of the predictive performance
being equal to that of the SPF forecast. The probabilistic no-change forecast (PNC) is obtained
using 20 quarters of data, and the length of the training period for all postprocessing methods is
40 quarters.

Forecast method                   Prediction horizon in quarters
                                  1           2           3           4           5
SPF                               0.89        1.44        1.51        1.49        1.49
Probabilistic no-change           1.45 (99)   1.46 (58)   1.45 (18)   1.48 (39)   1.48 (44)
HR with SPF covariates            0.91 (60)   1.42 (36)   1.50 (44)   1.53 (63)   1.49 (50)
HR with SPF and PNC covariates    0.87 (44)   1.40 (27)   1.51 (49)   1.47 (40)   1.48 (51)
GM                                1.05 (99)   1.43 (47)   1.45 (18)   1.47 (36)   1.47 (37)
GM with variance adjustment       0.89 (91)   1.47 (69)   1.50 (34)   1.48 (39)   1.45 (16)

Table 5: Mean continuous ranked probability score (CRPS) for forecasts of United States inflation
from the third quarter of 1995 to the first quarter of 2010, in percentage points, along with the lower
tail probability (in parentheses) of the Diebold-Mariano test under the hypothesis of the predictive
performance being equal to that of the SPF forecast. The probabilistic no-change forecast (PNC) is
obtained using 5 years of data and the length of the training period for all postprocessing methods
is 40 quarters.

Forecast method                   Prediction horizon in quarters
                                  1           2           3           4           5
SPF                               0.69        1.16        1.25        1.26        1.27
SPF median with MSE               0.70 (53)   1.10 (39)   1.15 (36)   1.14 (34)   1.13 (31)
Probabilistic no-change           1.08 (99)   1.10 (21)   1.10 (00)   1.10 (00)   1.11 (00)
HR with SPF covariates            0.69 (48)   1.08 (09)   1.15 (08)   1.15 (12)   1.13 (10)
HR with SPF and PNC covariates    0.68 (40)   1.09 (1?)   1.15 (12)   1.14 (11)   1.12 (12)
GM                                0.80 (96)   1.09 (17)   1.10 (01)   1.11 (01)   1.10 (00)
GM with variance adjustment       0.68 (36)   1.10 (20)   1.11 (00)   1.12 (00)   1.10 (00)
Figure 6: MAE (solid lines) and mean CRPS (dashed lines) for forecasts of United States inflation
from the first quarter of 2000 to the first quarter of 2010, as a function of the length of the rolling 
training period, at prediction horizons of one quarter (left) and five quarters (right). Results are 
shown for the SPF (blue), probabilistic no-change (red), HR with SPF covariates (yellow), HR with 
SPF and PNC covariates (orange), GM (dark green), and GM with variance adjustment (pale green) 
techniques. 

4 Robustness 

We have argued that the SPF median forecast outperforms simple no-change forecasts of 
United States inflation in current quarter nowcasts. However, at prediction horizons from 
two to five quarters ahead, probabilistic no-change forecasts have equal or higher skill than 
the SPF forecast, even after postprocessing. In making such claims, it is critically important 
to demonstrate the robustness of the results under changes in the details of the prediction 
experiment. 

An initial robustness check was done in Section 2, where Figure 2 showed the performance
of the probabilistic no-change forecast to depend little on the choice of the length of
the rolling training period, thereby justifying that we fix it at 20 quarters. In this section we
show that our key findings remain valid under changes in the length of the training period
for the postprocessing techniques, and we check whether they hold in smaller test periods,
under distinct economic regimes.

Figure 6 plots the MAE and the mean CRPS for our postprocessing methods in their
dependence on the length of the rolling training period, at prediction horizons of one and 
five quarters ahead. Except for the probabilistic no-change forecast, all methods show 
substantially higher skill at the shorter prediction horizon. Furthermore, for all methods 
the mean CRPS is much lower than the MAE. The predictive skill of the postprocessing 
methods does not depend much on the length of the training period, and our choice of a 40 
quarter period seems reasonable. 

Next we assess the effect of the test period and the corresponding economic regimes. 
In Tables 6 and 7, the aggregate results in Tables 1 and 2, which cover the third quarter of
1995 through the first quarter of 2010, have been stratified into three sub-periods of about
equal length. With shorter test periods, the Diebold-Mariano test statistic is occasionally
ill defined because of a negative variance estimate. Diebold and Mariano (1995) suggest
that the variance estimate should then be treated as zero and the null hypothesis of equal 
forecast accuracy be rejected. 
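
A negative variance estimate can arise because the estimator sums sample autocovariances of the loss differential without downweighting them. The sketch below shows one way the lower tail probability might be computed in Python. It assumes a rectangular window with h - 1 lags for an h-step forecast and a standard normal reference distribution, as in the usual textbook form of the Diebold-Mariano test, and it returns None in the cases that the tables in this section report as NA. The function name, the sign convention (competitor loss minus SPF loss) and the toy losses are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def dm_lower_tail(loss_competitor, loss_spf, horizon):
    """Lower tail probability of the Diebold-Mariano statistic, or None if undefined."""
    d = [a - b for a, b in zip(loss_competitor, loss_spf)]  # loss differential
    n = len(d)
    d_bar = fmean(d)

    def autocov(k):
        # Sample autocovariance of the loss differential at lag k.
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Unweighted sum over horizon - 1 lags; this estimate can turn out negative.
    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        return None  # test statistic is ill-defined
    t_stat = d_bar / sqrt(long_run_var / n)
    return NormalDist().cdf(t_stat)


# Hypothetical absolute errors: the competitor is clearly worse than the SPF.
spf_loss = [0.4, 0.6, 0.5, 0.7, 0.3, 0.5]
pnc_loss = [0.9, 1.1, 0.8, 1.2, 0.9, 1.0]
p = dm_lower_tail(pnc_loss, spf_loss, horizon=1)

# A strongly alternating differential yields a negative variance estimate.
alt_a = [1.0, 0.0] * 6
alt_b = [0.0, 1.0] * 6
undefined = dm_lower_tail(alt_a, alt_b, horizon=2)

print(p, undefined)
```

Under this sign convention, a probability close to one indicates that the competitor has higher loss than the SPF forecast, and a probability close to zero indicates the opposite.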

The first sub-period ranges from the third quarter of 1995 through the fourth quarter of 
2000. This was an era of general economic boom in the United States and inflation rates 
were particularly stable, which facilitated forecasting and is mirrored in low MAEs. Hence, 
the traditional no-change forecast performed quite well during this period, particularly at 
prediction horizons of three and four quarters, where it had lower MAE than all other 
forecasts, including the SPF and the probabilistic no-change forecasts. Another look at
Figure 1 supports this choice, because the behavior of inflation rates in the late 1990s
appears generally compatible with a symmetric random walk model, and the traditional
no-change forecast is the Bayes predictor under such an assumption (Granger, 1969; Gneiting,
2010). Thus, during this period of sustained economic growth the choice of the traditional
no-change forecast as reference standard was appropriate.

The second sub-period extends from the first quarter of 2001 through the fourth quarter 
of 2005. The expansion that began in the early 1990s came to an end at the beginning of 
this period with the early 2000s recession, and inflation rates started to behave much more 
erratically, not unlike a white noise model. The erratic behavior becomes even more pro- 
nounced during the third sub-period, which ranges from the first quarter of 2006 through 
the first quarter of 2010. This includes the late 2000s recession, a financial crisis much 
more severe than that of the previous period. Its effect culminated in the extreme
deflation observed in the fourth quarter of 2008, at -9.2 percentage points, and we now see
substantially lower levels of predictability than before.
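
The contrast between a random walk and white noise can be illustrated by simulation. The sketch below, which is not part of the study, compares the traditional no-change point forecast with a rolling-mean forecast on simulated series. The rolling mean over 20 steps is merely a convenient stand-in for a forecast that relies on the recent average rather than the latest value; it is not the probabilistic no-change forecast itself. The Gaussian shocks, the seed and the series length are arbitrary choices.

```python
import random
from statistics import fmean


def mae_no_change(series, start):
    """MAE when each value is forecast by the value immediately before it."""
    return fmean(abs(series[t] - series[t - 1]) for t in range(start, len(series)))


def mae_rolling_mean(series, window):
    """MAE when each value is forecast by the mean of the previous `window` values."""
    return fmean(
        abs(series[t] - fmean(series[t - window:t]))
        for t in range(window, len(series))
    )


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# White noise: the shocks themselves. Random walk: their cumulative sum.
white_noise = shocks
random_walk = []
level = 0.0
for e in shocks:
    level += e
    random_walk.append(level)

for name, series in [("random walk", random_walk), ("white noise", white_noise)]:
    print(name, mae_no_change(series, 20), mae_rolling_mean(series, 20))
```

Under the random walk the no-change forecast should show the lower MAE, whereas under white noise the ordering should reverse, in line with the change of regime described above.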

Generally, from 2000 on the probabilistic no-change forecast succeeds the traditional 
no-change forecast as the appropriate reference forecast. In line with the aggregate results 
in Tables 1 and 2, at prediction horizons from two to five quarters the SPF forecast tends
to benefit from postprocessing in terms of the mean CRPS, but not in terms of the MAE. 
We conclude that the spread adjustment generally is useful, but not the location adjustment. 
However, while the probabilistic forecast performance improves under postprocessing as 
measured by the mean CRPS, the postprocessed forecasts are unable to outperform the 
simplistic probabilistic no-change forecast. 

The sub-period from 2006 on also is the longest consecutive period for which a complete
set of forecasts is available for more than one individual SPF expert. In Table 6 we
show MAEs for the five professionals who had complete records during this period. The
predictive performance of the experts with IDs 463 and 483 is particularly impressive, and
their point forecasts are illustrated in Figure 5.

Table 6: Mean absolute error (MAE) for forecasts of United States inflation from the third quarter of
1995 to the fourth quarter of 2000 (top panel), from the first quarter of 2001 to the fourth quarter of
2005 (middle panel), and from the first quarter of 2006 to the first quarter of 2010 (bottom panel),
in percentage points, along with the lower tail probability of the Diebold-Mariano test under the
hypothesis of the predictive performance being equal to that of the SPF forecast, shown in
parentheses as a two-digit percentage. The NA symbol indicates a missing value due to a negative
variance estimate. A question mark marks a digit that is not legible in the source.

Forecast method                  Prediction horizon in quarters
                                 1          2          3          4          5

1995:Q3 - 2000:Q4
SPF                              0.52       0.87       0.93       0.95       0.95
Probabilistic no-change          0.87 (99)  0.90 (72)  0.93 (49)  0.96 (NA)  0.98 (NA)
Traditional no-change            0.76 (99)  0.92 (66)  0.78 (14)  0.88 (33)  1.15 (94)
HR with SPF covariates           0.54 (65)  0.84 (37)  0.94 (52)  1.00 (57)  0.99 (55)
HR with SPF and PNC covariates   0.56 (75)  0.88 (56)  0.96 (60)  1.03 (62)  1.01 (??)
GM                               0.59 (99)  0.91 (90)  0.94 (67)  0.96 (NA)  0.97 (NA)
GM with variance adjustment      0.52 (99)  0.87 (92)  0.93 (17)  0.95 (NA)  0.95 (9?)

2001:Q1 - 2005:Q4
SPF                              1.00       1.26       1.33       1.26       1.28
Probabilistic no-change          1.18 (87)  1.18 (12)  1.17 (??)  1.16 (24)  1.13 (01)
Traditional no-change            1.65 (99)  1.52 (90)  1.74 (97)  1.86 (99)  1.58 (99)
HR with SPF covariates           1.11 (98)  1.20 (03)  1.25 (22)  ?.?? (??)  1.18 (00)
HR with SPF and PNC covariates   1.13 (99)  1.21 (??)  1.25 (25)  1.22 (37)  1.17 (00)
GM                               1.08 (83)  1.17 (10)  1.16 (08)  1.15 (20)  1.14 (??)
GM with variance adjustment      1.00 (83)  1.24 (00)  1.29 (09)  1.19 (?3)  1.17 (??)

2006:Q1 - 2010:Q1
SPF                              1.22       2.39       2.48       2.47       2.45
Probabilistic no-change          2.52 (99)  2.51 (65)  2.46 (45)  2.52 (70)  2.55 (69)
Traditional no-change            3.36 (99)  4.18 (96)  3.88 (90)  3.84 (92)  3.72 (94)
HR with SPF covariates           1.14 (38)  2.43 (60)  2.53 (66)  2.59 (78)  2.52 (77)
HR with SPF and PNC covariates   1.00 (20)  2.33 (38)  2.51 (57)  2.35 (00)  2.51 (62)
GM                               1.62 (97)  2.43 (56)  2.46 (45)  2.52 (69)  2.52 (65)
GM with variance adjustment      1.22 (?4)  2.52 (73)  2.48 (50)  2.52 (7?)  2.44 (48)
SPF forecaster ID 463            1.06 (16)  2.35 (34)  2.57 (75)  2.41 (?3)  2.33 (00)
SPF forecaster ID 483            2.12 (99)  2.25 (22)  2.30 (15)  2.26 (05)  2.33 (?9)
SPF forecaster ID 510            2.32 (99)  2.67 (98)  2.48 (50)  2.71 (NA)  2.55 (?6)
SPF forecaster ID 516            1.53 (83)  3.34 (99)  3.03 (NA)  2.78 (90)  2.94 (99)
SPF forecaster ID 528            1.40 (68)  2.59 (76)  3.00 (98)  2.70 (NA)  2.55 (?0)

Table 7: Mean continuous ranked probability score (CRPS) for forecasts of United States inflation
from the third quarter of 1995 to the fourth quarter of 2000 (top panel), from the first quarter of
2001 to the fourth quarter of 2005 (middle panel), and from the first quarter of 2006 to the first
quarter of 2010 (bottom panel), in percentage points, along with the lower tail probability of the
Diebold-Mariano test under the null hypothesis of equal predictive performance between the SPF
forecast and the reference forecast, shown in parentheses as a two-digit percentage. The NA symbol
indicates a missing value due to a negative variance estimate. A question mark marks a digit that
is not legible in the source.

Forecast method                  Prediction horizon in quarters
                                 1          2          3          4          5

1995:Q3 - 2000:Q4
SPF                              0.39       0.66       0.73       0.76       0.77
SPF median with MSE              0.39 (38)  0.59 (04)  0.65 (12)  0.71 (24)  0.72 (30)
Probabilistic no-change          0.59 (99)  0.62 (20)  0.63 (??)  0.66 (00)  0.68 (01)
Traditional no-change with MSE   0.55 (98)  0.63 (36)  0.57 (??)  0.70 (31)  0.86 (81)
HR with SPF covariates           0.40 (62)  0.60 (21)  0.66 (26)  0.73 (44)  0.74 (44)
HR with SPF and PNC covariates   0.42 (76)  0.63 (34)  0.68 (30)  0.75 (48)  0.76 (48)
GM                               0.43 (98)  0.63 (20)  0.66 (01)  0.67 (00)  0.69 (00)
GM with variance adjustment      0.38 (22)  0.58 (00)  0.61 (00)  0.64 (00)  0.66 (02)

2001:Q1 - 2005:Q4
SPF                              0.82       1.03       1.07       1.07       1.07
SPF median with MSE              0.82 (59)  0.94 (??)  0.97 (00)  0.93 (00)  0.92 (00)
Probabilistic no-change          0.87 (73)  0.88 (01)  0.89 (00)  0.89 (00)  0.87 (00)
Traditional no-change with MSE   1.17 (98)  1.20 (86)  1.26 (83)  1.41 (98)  1.15 (72)
HR with SPF covariates           0.85 (78)  0.93 (02)  0.97 (07)  0.94 (06)  0.88 (00)
HR with SPF and PNC covariates   0.88 (90)  0.94 (03)  0.98 (09)  0.95 (06)  0.87 (00)
GM                               0.81 (44)  0.88 (01)  0.86 (00)  0.86 (02)  0.83 (00)
GM with variance adjustment      0.80 (21)  0.92 (00)  0.94 (00)  0.91 (00)  0.85 (00)

2006:Q1 - 2010:Q1
SPF                              0.92       1.98       2.13       2.15       2.17
SPF median with MSE              0.95 (??)  1.96 (46)  2.02 (32)  1.96 (NA)  1.90 (NA)
Probabilistic no-change          1.95 (99)  1.98 (?0)  1.94 (09)  1.95 (08)  1.95 (NA)
Traditional no-change with MSE   3.31 (99)  3.54 (98)  3.03 (84)  2.87 (85)  2.64 (79)
HR with SPF covariates           0.86 (39)  1.87 (29)  1.99 (22)  1.95 (??)  1.95 (NA)
HR with SPF and PNC covariates   0.78 (21)  1.86 (31)  1.97 (24)  1.91 (13)  1.91 (NA)
GM                               1.26 (97)  1.93 (42)  1.96 (15)  1.96 (15)  1.95 (12)
GM with variance adjustment      0.93 (58)  1.99 (52)  1.96 (13)  1.99 (12)  1.95 (07)

5 Conclusions



Predicting inflation is important, and there are various ways of doing it, including forecasts 
from the Survey of Professional Forecasters (SPF) and simple no-change forecasts. In the 
extant literature, the traditional no-change forecast has served as a benchmark, to which 
more sophisticated techniques are to be compared. While this was appropriate during the 
late 1990s economic boom, in today's turbulent markets the equally simplistic probabilis- 
tic no-change forecast performs much better. To avoid spurious claims of predictability, 
we make a plea for the use of the probabilistic no-change forecast as a default reference 
standard in inflation forecasting. 

In current quarter nowcasting, corresponding to a prediction horizon of one quarter, 
economic expertise is persuasive, and the SPF professionals outperform all types of no- 
change forecasts. Businesses, organizations and the government are well advised to avail 
themselves of the SPF experts' short-term predictions. 

At prediction horizons beyond the current quarter, the probabilistic predictive perfor- 
mance of the SPF forecast, as measured by the mean CRPS, improves under statistical 
postprocessing. Supplementing the SPF median with MSE method, which has been used 
by the Federal Reserve Bank of Philadelphia, we have introduced heteroscedastic regres- 
sion (HR) and Gaussian mixture (GM) techniques for doing this. However, at prediction 
horizons of two or more quarters, even postprocessed SPF forecasts fail to outperform the 
simplistic probabilistic no-change forecast. 

While novel and potentially surprising in the specific context of the SPF and inflation
rates, the result conforms with a general theme in the forecasting literature, in that
simple prediction methods tend to perform well, with overfitting being heavily penalized and
subjective human expertise often being overrated. For example, Nelson (1972) showed
that simple statistical methods can outperform complex economic simulation models, and
a recent 20-year study argues persuasively that professional forecasters, who appear as
experts in the media and advise governments, organizations and businesses, might not be
better prognosticians than John Q. Public (Tetlock, 2005). In this light, we might be well
advised to adapt to low levels of socio-economic predictability, where uncertainty reigns
(Makridakis and Taleb, 2009).



Appendix 

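Here we provide a closed form expression for the continuous ranked probability score (1)
when the predictive distribution is the three-parameter two-piece normal distribution with
CDF given by (2). A tedious but straightforward calculation shows that

\[
\mathrm{CRPS}(F_{\mathrm{TPN}}, y) =
\begin{cases}
\dfrac{4\sigma_1^2}{\sigma_1+\sigma_2}
\left[ \dfrac{y-\mu}{\sigma_1}\, \Phi\!\left(\dfrac{y-\mu}{\sigma_1}\right)
     + \varphi\!\left(\dfrac{y-\mu}{\sigma_1}\right) \right]
- (y-\mu)
+ \dfrac{2}{\sqrt{\pi}}\,
  \dfrac{\sqrt{2}\,\sigma_2(\sigma_2^2-\sigma_1^2) - (\sigma_1^3+\sigma_2^3)}{(\sigma_1+\sigma_2)^2}
& \text{if } y \le \mu, \\[3ex]
\dfrac{4\sigma_2^2}{\sigma_1+\sigma_2}
\left[ \dfrac{y-\mu}{\sigma_2}\, \Phi\!\left(\dfrac{y-\mu}{\sigma_2}\right)
     + \varphi\!\left(\dfrac{y-\mu}{\sigma_2}\right) \right]
+ (y-\mu)\, \dfrac{(\sigma_1-\sigma_2)^2 - 4\sigma_2^2}{(\sigma_1+\sigma_2)^2}
+ \dfrac{2}{\sqrt{\pi}}\,
  \dfrac{\sqrt{2}\,\sigma_1(\sigma_1^2-\sigma_2^2) - (\sigma_1^3+\sigma_2^3)}{(\sigma_1+\sigma_2)^2}
& \text{if } y > \mu,
\end{cases}
\tag{8}
\]

where $\varphi$ and $\Phi$ denote the density and the cumulative distribution function of the
standard normal distribution.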
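
The closed form (8) can be checked numerically. The following Python sketch implements it and compares it with a direct numerical integration of the defining integral of the CRPS, that is, the integral of (F(x) - 1{x >= y})^2 over the real line. It assumes the usual parametrization of the two-piece normal CDF, with location mu, left scale s1 and right scale s2; the function names, the integration settings and the test values are illustrative assumptions.

```python
from math import pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def tpn_cdf(x, mu, s1, s2):
    """CDF of the two-piece normal with left scale s1 and right scale s2 (assumed form)."""
    if x <= mu:
        return 2 * s1 / (s1 + s2) * _STD.cdf((x - mu) / s1)
    return (s1 - s2) / (s1 + s2) + 2 * s2 / (s1 + s2) * _STD.cdf((x - mu) / s2)


def crps_tpn(y, mu, s1, s2):
    """Closed-form CRPS of the two-piece normal distribution at the outcome y."""
    s = s1 + s2
    if y <= mu:
        z = (y - mu) / s1
        lead = 4 * s1**2 / s * (z * _STD.cdf(z) + _STD.pdf(z)) - (y - mu)
        const = sqrt(2) * s2 * (s2**2 - s1**2) - (s1**3 + s2**3)
    else:
        z = (y - mu) / s2
        lead = (4 * s2**2 / s * (z * _STD.cdf(z) + _STD.pdf(z))
                + (y - mu) * ((s1 - s2)**2 - 4 * s2**2) / s**2)
        const = sqrt(2) * s1 * (s1**2 - s2**2) - (s1**3 + s2**3)
    return lead + 2 / sqrt(pi) * const / s**2


def crps_numeric(y, mu, s1, s2, half_width=12.0, steps=40000):
    """Midpoint-rule approximation of the integral of (F(x) - 1{x >= y})^2."""
    lo = min(mu, y) - half_width * max(s1, s2)
    hi = max(mu, y) + half_width * max(s1, s2)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        indicator = 1.0 if x >= y else 0.0
        total += (tpn_cdf(x, mu, s1, s2) - indicator) ** 2
    return total * h


for args in [(-1.0, 0.0, 1.0, 2.0), (1.5, 0.0, 1.0, 2.0), (0.3, 0.5, 2.0, 0.7)]:
    print(args, crps_tpn(*args), crps_numeric(*args))
```

With equal scales the expression reduces to the familiar closed form for the normal distribution, which provides a further consistency check.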



Grimit et al. (2006) give a similar formula when the predictive distribution is a mixture of
Gaussian components, which we use to compute the CRPS for the GM forecasts.